VLSI Circuits for Approximate Computing
Approximate Computing has recently emerged as a promising solution to enhance circuit performance by relaxing the requirement for exact calculations. Multimedia and Machine Learning are typical examples of error-resilient, albeit compute-intensive, applications.
In this dissertation, the design and optimization of approximate fundamental VLSI digital blocks are investigated.
Chapter one discusses the theoretical motivations for Approximate Computing from the VLSI perspective.
Chapter two reports my research on approximate adders, covering both traditional, non-error-tolerant applications and error-resilient applications.
Chapter three investigates precision-scalable units. Real-time precision scalability allows the precision level of a unit to be adapted to the precision requirements of the application. In this context, my research on approximate Multiply-and-Accumulate and memory units is described.
Chapter four discusses a precision-scalable approximate convolver for computer vision applications. It is built from the approximate Multiply-and-Accumulate and memory units presented in chapter three.
Approximate computing in the nanoscale era
The reduced benefits offered by technology scaling in the nanoscale era call for innovative design approaches to process ever larger amounts of data with ever higher performance and lower power consumption. In this respect, Approximate Computing constitutes one of the most promising trends: efficiency is increased by breaking the dogma of error-free computation, enlarging the design space with application-specific quality metrics. The Approximate Computing paradigm can be applied at different layers, spanning from software to systems and circuits. In this paper we focus on approximate arithmetic circuits for computer vision and machine learning. These applications are highly resilient to computation errors. Moreover, their arithmetic-intensive processing makes improving the efficiency of arithmetic circuits a key point. In this context Approximate Computing can make the difference, improving performance and power consumption at the same time with tolerable quality degradation.
Variable Latency Speculative Parallel Prefix Adders for Unsigned and Signed Operands
A variable latency adder (VLA) reduces average addition time by using speculation: the exact arithmetic function is replaced by an approximate one that is faster and gives correct results most of the time. When speculation fails, an error detection and correction circuit gives the correct result in the following clock cycle. Previous papers investigate VLAs based on Kogge-Stone, Han-Carlson or carry-select topologies, speculating that carry propagation involves only a few consecutive bits. In several applications using 2's complement representation, however, operands have a Gaussian distribution and a nontrivial portion of carry chains can be as long as the adder size. In this paper we propose five novel VLA architectures, based on Brent-Kung, Ladner-Fischer, Sklansky, Hybrid Han-Carlson, and carry-increment parallel-prefix topologies. Moreover, we present a new, efficient error detection and correction technique that makes the proposed VLAs suitable for applications using 2's complement representation. To investigate VLA performance, the proposed architectures have been synthesized using the UMC 65 nm library, for operand lengths ranging from 32 to 128 bits. The results show that the proposed VLAs outperform previous speculative architectures and standard (non-speculative) adders when high speed is required.
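The speculation mechanism described in this abstract can be illustrated with a small behavioral model. The sketch below is not the paper's circuit: it assumes a generic window-based speculation (each carry is computed from only the `k` preceding bits) and a conservative error detector that flags any run of `k` consecutive propagate bits. The function names and the parameters `n` and `k` are hypothetical, chosen for illustration.

```python
def speculative_add(a, b, n=32, k=8):
    """Behavioral model of a window-based speculative adder (illustrative only).

    Returns (speculative_sum, error_flag) for n-bit unsigned operands.
    """
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    p = a ^ b  # per-bit propagate signals
    g = a & b  # per-bit generate signals

    carries = 0
    for i in range(1, n):
        # Speculative carry into bit i: look only at the k bits below it,
        # assuming the carry entering that window is 0.
        c = 0
        for j in range(max(0, i - k), i):
            c = ((g >> j) & 1) | (((p >> j) & 1) & c)
        carries |= c << i
    spec_sum = (p ^ carries) & mask

    # Conservative error detection (an assumption of this sketch): a wrong
    # speculative carry requires at least k consecutive propagate bits.
    run, error = 0, False
    for j in range(n):
        run = run + 1 if (p >> j) & 1 else 0
        if run >= k:
            error = True
    return spec_sum, error


def variable_latency_add(a, b, n=32, k=8):
    """Return (sum, cycles): 1 cycle if speculation holds, 2 if corrected."""
    spec_sum, error = speculative_add(a, b, n, k)
    if error:
        # Correction step, modeled here as an exact addition.
        return (a + b) & ((1 << n) - 1), 2
    return spec_sum, 1
```

In this model the detector may raise false alarms (a long propagate run that receives no incoming carry), which costs an unnecessary second cycle but never produces a wrong result. Reducing such false alarms is one of the aspects that error detection techniques for VLAs aim to improve.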
Digital circuit for the generation of colored noise exploiting single bit pseudo random sequence
The generation of complex signal sources is important for the test and validation of electronic systems. With reference to noise sources, commercial systems only provide white noise sources, while the scientific literature has only recently proposed circuits that generate programmable colored noise. This paper proposes a programmable colored noise generator that, while generating noise signals with features matching the state of the art, outperforms previously proposed circuits in terms of speed (+10%) and logic resource occupation (-75%).
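The general idea of shaping a single-bit pseudo-random sequence into colored noise can be sketched in software. The example below is an assumption-laden illustration, not the circuit proposed in the paper: it uses a 16-bit Fibonacci LFSR with a commonly used set of feedback taps as the single-bit source, and a plain FIR filter whose coefficients (`coeffs`) are hypothetical. Because each input sample is a single bit, every tap reduces to adding or subtracting a coefficient, so no multiplier is needed.

```python
def lfsr16_bits(count, seed=0xACE1):
    """Generate `count` pseudo-random bits from a 16-bit Fibonacci LFSR."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(count):
        bits.append(state & 1)
        # Feedback from state bits 0, 2, 3 and 5 (illustrative tap choice).
        fb = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (fb << 15)
    return bits


def fir_single_bit(bits, coeffs):
    """Filter a single-bit stream, mapping bit 1 to +1 and bit 0 to -1.

    Only fully overlapped output samples are returned, so the result has
    len(bits) - len(coeffs) + 1 entries.
    """
    out = []
    for n in range(len(coeffs) - 1, len(bits)):
        acc = 0
        for k, h in enumerate(coeffs):
            # Single-bit input: add or subtract the coefficient, no multiply.
            acc += h if bits[n - k] else -h
        out.append(acc)
    return out


# Usage: a 4-tap moving average gives a low-pass ("colored") spectrum.
noise = fir_single_bit(lfsr16_bits(1000), [1, 1, 1, 1])
```

Changing `coeffs` changes the spectral shape of the output, which is the sense in which such a generator is programmable.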
Variable latency speculative Han-Carlson adders topologies
Speculation can enhance adder performance by making carry predictions. It consists of replacing the arithmetic function with a faster, approximate one that gives correct results most of the time. An error detection stage flags misprediction events; in such cases, a two-cycle error correction stage is used, resulting in a variable latency speculative adder. This paper proposes novel variable latency speculative adders based on Han-Carlson parallel-prefix topologies. The proposed adders are more effective than the variable latency Kogge-Stone adders previously proposed in the literature. A novel error detection technique that reduces error probability, compared to previous approaches, is proposed. Synthesis results, in the UMC 65nm library, show that the proposed variable latency topologies outperform previously developed speculative Kogge-Stone adders and non-speculative ones when high speed is required. It is also shown that non-speculative adders remain the best choice when the speed constraint is relaxed.
Variable Latency Speculative Han-Carlson Adder
Variable latency adders have recently been proposed in the literature. A variable latency adder employs speculation: the exact arithmetic function is replaced with an approximate one that is faster and gives the correct result most of the time, but not always. The approximate adder is augmented with an error detection network that asserts an error signal when speculation fails. Speculative variable latency adders have attracted strong interest thanks to their capability to reduce average delay compared to traditional architectures.
This paper proposes a novel variable latency speculative adder based on the Han-Carlson parallel-prefix topology, which proves more effective than the variable latency Kogge-Stone topology.
The paper describes the stages into which variable latency speculative prefix adders can be subdivided and presents a novel error detection network that reduces error probability compared to previous approaches.
Several variable latency speculative adders, for various operand lengths, using both Han-Carlson and Kogge-Stone topologies, have been synthesized using the UMC 65nm library. The results show that the proposed variable latency Han-Carlson adder outperforms both previously proposed speculative Kogge-Stone architectures and non-speculative adders when high speed is required. It is also shown that non-speculative adders remain the best choice when the speed constraint is relaxed.
Single Bit Filtering Circuit Implemented in a System for the Generation of Colored Noise
The generation of complex signal sources is important for the test and validation of electronic systems. With reference to noise sources, commercial systems usually provide white noise sources, while the scientific literature has only recently proposed circuits that generate programmable colored noise. This paper proposes a filtering circuit, together with an algorithm to design it, that produces arbitrary colored electrical noise. The proposed system improves on the performance of previously proposed circuits in terms of the spectral characteristics of the output, logic resource occupation and power dissipation, with no penalty on the working frequency.
Approximate Multipliers Based on New Approximate Compressors
Approximate computing is an emerging trend in digital design that trades off the requirement of exact computation for improved speed and power performance. This paper proposes novel approximate compressors and an algorithm to exploit them for the design of efficient approximate multipliers. Using the proposed approach, we have synthesized approximate multipliers for several operand lengths using a 40-nm library. Comparison with previously presented approximate multipliers shows that the proposed circuits provide better power or speed for a target precision. Applications to image filtering and to adaptive least mean squares filtering are also presented in the paper.
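The trade-off between precision and hardware cost in approximate multipliers can be explored with a simple software model. The sketch below does not reproduce the compressors proposed in the paper. As a stated simplification, it compresses each of the `t` least significant partial-product columns with a single OR (dropping all carries from that column) and sums the remaining columns exactly. The names `approx_multiply`, `mean_error_distance`, `n` and `t` are hypothetical.

```python
def approx_multiply(a, b, n=8, t=4):
    """Multiply two n-bit unsigned integers with OR-compressed low columns.

    Columns 0..t-1 of the partial-product matrix are each reduced to one bit
    by OR-ing their entries (illustrative approximation); the remaining
    columns are summed exactly. With t=0 the result equals a * b.
    """
    result = 0
    for col in range(2 * n - 1):
        # Partial-product bits a_i AND b_(col-i) that fall in this column.
        bits = [((a >> i) & 1) & ((b >> (col - i)) & 1)
                for i in range(n) if 0 <= col - i < n]
        if col < t:
            result += (1 if any(bits) else 0) << col  # approximate column
        else:
            result += sum(bits) << col                # exact column
    return result


def mean_error_distance(n=8, t=4):
    """Average |exact - approximate| over all pairs of n-bit operands."""
    total = 0
    for a in range(1 << n):
        for b in range(1 << n):
            total += abs(a * b - approx_multiply(a, b, n, t))
    return total / float(1 << (2 * n))
```

Sweeping `t` and plotting `mean_error_distance` against it gives a rough picture of how precision degrades as more columns are approximated. Hardware cost, speed and power would have to come from synthesis, which this model does not capture.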